17. EKF

Multi-dimensional Extended Kalman Filter

Now you’ve seen the fundamentals behind the Extended Kalman Filter. The mechanics are not too different from those of the Kalman Filter, with the exception that a nonlinear motion or measurement function must be linearized before it can be used to update the variance.

You’ve seen how this can be done for a one-dimensional state prediction or measurement function, but now it’s time to explore how to linearize functions with multiple dimensions. To do this, we will be using the multi-dimensional Taylor series.

Linearization in Multiple Dimensions

The equation for a multidimensional Taylor Series is presented below.

\large T(x) = f(a) + (x-a)^TDf(a) + \frac{1}{2!}(x-a)^TD^2f(a)(x-a) + \ldots

You will see that it is very similar to the 1-dimensional Taylor Series. As before, to calculate a linear approximation, we only need the first two terms.

\large T(x) = f(a) + (x-a)^TDf(a)

You may notice a new term, Df(a). This is the Jacobian matrix, and it holds the partial derivative terms for the multi-dimensional equation.

\large Df(a) = \frac{\partial f(a)}{\partial x}

In its expanded form, the Jacobian is a matrix of partial derivatives. It tells us how each of the components of f changes as we change each of the components of the state vector.

\large Df(a) = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \ldots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \ldots & \frac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \ldots & \frac{\partial f_m}{\partial x_n} \end{bmatrix}

The rows correspond to the dimensions of the function f, and the columns correspond to the dimensions (state variables) of x. The first element of the matrix is the partial derivative of the first dimension of the function with respect to the first dimension of x.

The Jacobian is a generalization of the 1-dimensional derivative; in the 1-dimensional case, it reduces to the single term df/dx.

Example Application

This will make more sense in context, so let’s look at a specific example. Let’s say that we are tracking the x-y coordinates of an object. This is to say that our state is a vector x, with state variables x and y.

\large x = \begin{bmatrix} x \\ y \end{bmatrix}

However, our sensor does not allow us to measure the x and y coordinates of the object directly. Instead, our sensor measures the distance from the robot to the object, r, as well as the angle between r and the x-axis, θ.

\large z = \begin{bmatrix} r \\ \theta \end{bmatrix}

It is important to notice that our state is using a Cartesian representation of the world, while the measurements are in a polar representation. How will this affect our measurement function?

Our measurement function maps the state to the observation, as follows,

\large \begin{bmatrix} x \\ y \end{bmatrix} \xrightarrow{\text{meas. function}} \begin{bmatrix} r \\ \theta \end{bmatrix}

Thus, our measurement function must map from Cartesian to polar coordinates. But there is no matrix, H, that will successfully make this conversion, as the relationship between Cartesian and polar coordinates is nonlinear.

\large r = \sqrt{x^2 + y^2}
\large \theta = \tan^{-1}\left(\frac{y}{x}\right)

For this reason, instead of using the measurement residual equation y = z - Hx' that you had seen before, the mapping must be made with a dedicated function, h(x').

\large h(x') = \begin{bmatrix} \sqrt{x^2 + y^2} \\ \tan^{-1}\left(\frac{y}{x}\right) \end{bmatrix}

Then the measurement residual equation becomes y = z - h(x').

The state's covariance, however, cannot be updated the same way, as passing a Gaussian through the nonlinear function h would turn it into a non-Gaussian distribution (as seen in the previous video). Let's calculate a linearization, H, and use it instead. The Taylor series for the function h(x), centered about the mean μ, is defined below; since h is vector-valued, the Jacobian multiplies (x - μ) from the left.

\large h(x) \simeq h(\mu) + Df(\mu)(x-\mu)

The Jacobian, Df(\mu), is defined below. But let's call it H, since it's the linearization of our measurement function, h(x).

\Large H = \begin{bmatrix} \frac{\partial r}{\partial x} & \frac{\partial r}{\partial y} \\ \frac{\partial \theta}{\partial x} & \frac{\partial \theta}{\partial y}\end{bmatrix}

If you were to compute each of those partial derivatives, the matrix would reduce to the following,

\Large H = \begin{bmatrix} \frac{x}{\sqrt{x^2 + y^2}} & \frac{y}{\sqrt{x^2 + y^2}} \\ -\frac{y}{x^2 + y^2} & \frac{x}{x^2 + y^2} \end{bmatrix}

It's this matrix, H, that can then be used to update the state's covariance.

To summarize the flow of calculations for the Extended Kalman Filter, it's worth revisiting the equations to see what has changed and what has remained the same.

Extended Kalman Filter Equations

These are the equations that implement the Extended Kalman Filter - you'll notice that most of them remain the same, with the replaced equations crossed out in red.

State Prediction:

\large \color{red} \cancel{x' = Fx} \quad \color{black} \rightarrow \quad x' = f(x)
\large \color{black} P' = \color{blue}F\color{black}P\color{blue}F^T\color{black} + Q

Measurement Update:

\large \color{red} \cancel{y = z - Hx'}\quad \color{black} \rightarrow \quad y = z -h(x')
\large \color{black} S = \color{blue}H\color{black}P' \color{blue}H^T\color{black} + R

Calculation of Kalman Gain:

\large \color{black} K = P' \color{blue}H^T\color{black}S^{-1}

Calculation of Posterior State and Covariance:

\large \color{black} x = x' + Ky
\large \color{black} P = (I - K \color{blue}H\color{black})P'

Highlighted in blue are the Jacobians, F and H, which are the linearizations of the state transition and measurement functions.

The Extended Kalman Filter requires us to calculate the Jacobian of a nonlinear function as part of every single iteration, since the mean (which is the point that we linearize about) is updated.

Summary

Phew, that got complicated quickly! Here are the key takeaways about Extended Kalman Filters:

  • The Kalman Filter cannot be used when the measurement and/or state transition functions are nonlinear, since this would result in a non-Gaussian distribution.

  • Instead, we take a local linear approximation and use this approximation to update the covariance of the estimate. The linear approximation is made using the first two terms of the Taylor Series, which include the first derivative of the function.

  • In the multi-dimensional case, taking the first derivative isn't as simple, since there are multiple state variables and multiple dimensions. Here we employ the Jacobian, a matrix of partial derivatives containing the partial derivative of each dimension of the function with respect to each state variable.

While it's important to understand the underlying math to employ the Kalman Filter, don't feel the need to memorize these equations. Chances are, whatever software package or programming language you're working with will have libraries that allow you to apply the Kalman Filter, or at the very least perform linear algebra calculations (such as matrix multiplication and calculating the Jacobian).